Discovering Substantial Distinctions among Incremental Bi-Clusters
Authors
Abstract
A fundamental task of data analysis is understanding what distinguishes the clusters found within the data. We present the problem of mining distinguishing sets, which seeks sets of objects or attributes that induce the most change among the incremental bi-clusters of a binary data set. Unlike emerging patterns and contrast sets, which focus only on statistical differences between the supports of itemsets, our approach considers distinctions in both the attribute space and the object space. Viewing the lattice of bi-clusters formed within a data set as a weighted directed graph, we mine the most significant distinguishing sets by growing a maximal cost spanning tree of the lattice. In this paper we present a weighting function for measuring distinction among bi-clusters in the lattice and the novel MIDS algorithm. MIDS simultaneously enumerates bi-clusters, constructs the bi-cluster lattice, and computes the distinguishing sets. The computational efficiency of MIDS is shown in performance tests on real-world and benchmark data sets. The utility of distinguishing sets is also demonstrated with experiments on synthetic and real data.
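The idea of treating the bi-cluster lattice as a weighted directed graph and keeping its heaviest spanning tree can be illustrated with a small sketch. It is not the MIDS algorithm, and the abstract does not give the paper's weighting function: the `distinction` weight below (the count of objects and attributes on which two bi-clusters differ), the toy lattice, and all names are assumptions made for illustration. Because a lattice graph is acyclic, keeping the heaviest incoming edge of every non-root node is one simple way to obtain a maximal cost spanning tree.

```python
from typing import Dict, FrozenSet, List, Tuple

# A bi-cluster is modelled as (objects, attributes); this representation is an assumption.
BiCluster = Tuple[FrozenSet[str], FrozenSet[str]]


def distinction(parent: BiCluster, child: BiCluster) -> int:
    """Hypothetical weight: how many objects and attributes differ between two bi-clusters."""
    return len(parent[0] ^ child[0]) + len(parent[1] ^ child[1])


def max_cost_spanning_tree(
    nodes: Dict[str, BiCluster], edges: List[Tuple[str, str]]
) -> Dict[str, Tuple[str, int]]:
    """Map each non-root node to (chosen parent, edge weight).

    Valid for acyclic graphs only: picking the heaviest incoming edge
    per node cannot create a cycle, so the result is a spanning tree.
    """
    best: Dict[str, Tuple[str, int]] = {}
    for parent, child in edges:
        weight = distinction(nodes[parent], nodes[child])
        if child not in best or weight > best[child][1]:
            best[child] = (parent, weight)
    return best


# Toy lattice for the binary data set o1:{a,b}, o2:{a,b}, o3:{a,c}, o4:{a,b,c}.
nodes = {
    "top": (frozenset({"o1", "o2", "o3", "o4"}), frozenset({"a"})),
    "ab": (frozenset({"o1", "o2", "o4"}), frozenset({"a", "b"})),
    "ac": (frozenset({"o3", "o4"}), frozenset({"a", "c"})),
    "bottom": (frozenset({"o4"}), frozenset({"a", "b", "c"})),
}
edges = [("top", "ab"), ("top", "ac"), ("ab", "bottom"), ("ac", "bottom")]

tree = max_cost_spanning_tree(nodes, edges)
for child, (parent, weight) in sorted(tree.items()):
    print(f"{parent} -> {child}: weight {weight}")
```

In this toy case the edge from `ab` to `bottom` is kept over the edge from `ac` to `bottom`, because under the assumed weight it changes more objects and attributes.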
Similar resources
Distributed and Cooperative Compressive Sensing Recovery Algorithm for Wireless Sensor Networks with Bi-directional Incremental Topology
Recently, the problem of compressive sensing (CS) has attracted a lot of attention in the area of signal processing, and much of the research in this field addresses it. One of the applications where CS could be used is wireless sensor networks (WSNs). The structure of WSNs consists of many low-power wireless sensors. This requires that any improved algorithm for this appli...
Linear Coherent Bi-cluster Discovery via Line Detection and Sample Majority Voting
Discovering groups of genes that share common expression profiles is an important problem in DNA microarray analysis. Unfortunately, standard bi-clustering algorithms often fail to retrieve common expression groups because (1) genes only exhibit similar behaviors over a subset of conditions, and (2) genes may participate in more than one functional process and therefore belong to multiple group...
Discovering and Analyzing the Intellectual Structure and Its Evolution in Core Journals of "Knowledge and Information Science" during 2004-2013
Purpose: This study aims to reveal the intellectual structure of Knowledge and Information Science and its evolution, along with a review of the journals' subjective scope, based on 6830 abstracts in the ten core journals in the JCR 2013 over the ten years 2004-2013. Methodology: In this research, co-word and correspondence analysis of 150 words, selected by tf-idf weight, were done after parametri...
A Hybrid Meta-Heuristic Method to Optimize Bi-Objective Single Period Newsboy Problem with Fuzzy Cost and Incremental Discount
In this paper, the real-world occurrence of the multiple-product, multiple-constraint single period newsboy problem with two objectives, in which there are incremental discounts on the purchasing prices, is investigated. The constraints are the warehouse capacity and the batch forms of the order placements. The first objective of this problem is to find the order quantities such that the expected ...
A cue-based approach to 'theory of mind': re-examining the notion of automaticity.
The potential utility of a distinction between 'automatic (or spontaneous) and implicit' versus 'controlled and explicit' processes in theory of mind (ToM) is undercut by the fact that the terms can be employed to describe different but related distinctions within cognitive systems serving that function. These include distinctions in the underlying cognitive systems, processes, or representatio...